Heuristically guided constraint satisfaction for AI planning
Abstract
Constraint satisfaction techniques have been used in Artificial Intelligence planning for many years. As with the original planning problem formulation, the CSP model’s complexity can lead to a very large search space for all but the simplest of problems. Without additional guidance, the CSP solver will rely on standard CSP heuristics and search methods. Better use can be made of the CSP framework if the search is informed by the structure of the planning problem. This paper discusses a goal-centric, variable/value selection heuristic method of guiding the search for a solution to a constraint encoding of classical planning problems. Also, meta-CSP methods that provide further propagation are introduced. The prototype uses an extensional encoding of the problem, goal ordering, the variable/value ordering heuristic and the meta-CSP variables. Preliminary results on a number of test domains are presented. These show an improvement over the same encoding without heuristic guidance.

Introduction

Artificial Intelligence (AI) Planning (Russell et al. 2003) seeks to find a set of actions that transform a given initial state into a partially specified goal state. The Planning Domain Description Language (PDDL) (Ghallab et al. 1998) is the standard representation used by the research community to describe planning problems. An alternative approach is to reformulate the problem as a Constraint Satisfaction Problem (CSP) (Dechter 2003). Using a CSP formulation allows planning problems to take advantage of Constraint Programming (CP) techniques, including enhanced propagation machinery and better pruning mechanisms. These are techniques not fully exploited by the planning community. Instead, many successful contemporary planners¹ rely on the use of heuristics based on a relaxed version of the original problem.

Noting that CSP-based planners do not yet match the performance of their state-of-the-art counterparts raises the question: why not?
Firstly, for most interesting problems, the size of the search space can become extremely large, growing exponentially with the problem size. Secondly, with a large problem comes an increased plan length (horizon). This large horizon greatly reduces the impact of the propagation resulting from the goal-state constraints. That is, with a small horizon, a CSP solver benefits from inferences made as a result of the constraints placed on the goal-state variables. These inferred values result in new constraints which, in turn, provide further pruning of the search space. In this manner, in an ideal situation, the power of the declarative CSP paradigm becomes clear; a solution can be inferred without the use of search.

It has been shown (Vidal and Geffner 2006) that it is possible to solve simple planning problems using inference alone. The authors describe a new version of the partial-order CPT planner, eCPT, that makes use of landmarks (Porteous and Sebastia 2004) and additional distance constraints (van Beek and Chen 1999).

This paper presents a technique with a similar theme: increased use of inference, with much less reliance on uninformed search. The aim is to use a given planning problem’s structure to inform the variable and value selection method in the CSP encoding of that problem. This is a goal-centric approach which attempts to satisfy each goal variable’s value as early in the plan as possible. By doing so, a series of intermediate horizons are introduced. These provide a bridge to the extra propagation required to find a faster solution. Additionally, in order to maintain the value of newly achieved goals, this work introduces a meta-CSP approach to locking the relevant variable’s value until the end of the plan. Here, too, extra propagation is the result.

¹ IPC-2011: http://www.plg.inf.uc3m.es/ipc2011deterministic/Results?action=AttachFiledo= viewtarget=ipc2011booklet.pdf
Application of this algorithm to extensional CSP encodings of a number of planning problems has shown promising results for the production of sub-optimal plans.

The remainder of this paper is structured as follows. Firstly, a short introduction to planning as CSP is presented. The technical details of the goal-directed heuristic algorithm and the meta-CSP locking technique are then described, together with details of the prototype system used for testing. Preliminary results of the algorithm’s implementation are shown, and finally conclusions are drawn, with the direction of future work indicated.

Planning as Constraint Satisfaction

Planning is a human decision making process which seeks to achieve a given outcome by using a set of predictable operations in sequence. AI planning attempts to automate this process. Informally, we can state that automated planning is a sequential decision making technique, the purpose of which is to provide a set of actions that transform a fully specified initial state into a given goal state. In order to manage the complexity of the real world, AI planning restricts the information described and abstracts much of the unnecessary detail. Classical planning, also known as STRIPS planning (Fikes and Nilsson 1990), requires that the state space be finite and fully observable. It is assumed that only specified actions can change a state and that they do so instantaneously, with the resulting state being predictable.

Definition 1. A STRIPS planning problem is a triple, $P = (O, I, G)$, where $O$ is a set of operators, $I$ is a conjunction of fact literals describing the initial state, and $G$ is another conjunction of facts describing a partially specified goal state.

Alternative solution techniques can be used if the planning problem is reformulated. One such approach sees the planning problem cast as a CSP.

Definition 2. A constraint satisfaction problem, $M$, is a triple, $(X, D, C)$, where $X = \{x_1, x_2, \ldots, x_n\}$ is a finite set of variables, $D = \{d_1, d_2, \ldots, d_n\}$ is a finite set of domains for those variables, and $C = \{c_1, c_2, \ldots, c_m\}$ is a set of constraints, where each constraint defines a predicate which is a relation over a particular subset of $X$.

As with all software engineering, eliciting the requirements and formulating a full and correct problem specification are central to being able to find a solution to the original problem. When using CSPs for planning, we already have the problem specification, generally in PDDL. However, there are many ways in which to model the problem as a CSP, and it is well known that the modelling choices made are critical (Freuder 1999). It has been shown (Beacham et al. 2001) that the choices of model, search algorithm and heuristic interact and none of these decisions should be made independently.

In this work, the first step towards modelling the problem was to convert from PDDL into a representation based on SAS+ (Backstrom 1992). By using intermediate output from the Fast Downward planner (Helmert 2006), it was possible to gain access to such a SAS+ description. Basing the constraint model on this multi-valued representation is good modelling practice (Smith 2005) since it provides a smaller number of variables, each with larger domains.

Another important aspect of modelling is the way in which the constraints are expressed. For example, it is possible to use either a series of binary ≠ constraints or a global allDifferent constraint. Knowledge of the differing constraint propagation behaviours and costs is essential if best use is to be made of the CSP approach (Harvey, Kelsey, and Petrie 2003). Based on the authors’ empirical experience and on the results of recent research (Bartak and Toropila 2008b), it is clear that an extensional representation of constraints leads to a more efficient model of a given planning problem. For that reason, extensional, or table, constraints were used in this work.
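To make Definition 2 and the idea of an extensional (table) constraint concrete, the following is a minimal Python sketch. It is not the prototype's encoding: the variable names `a`, `b`, `c`, their domains, the allowed tuples and the helper names `not_equal_table`, `is_consistent` and `solutions` are all assumptions chosen for illustration. Each constraint is stored as a scope plus the explicit set of tuples it permits, and a brute-force enumeration stands in for a real solver.

```python
from itertools import product

# Hypothetical CSP (X, D, C): variable names and domains are illustrative only.
domains = {"a": [0, 1, 2], "b": [0, 1, 2], "c": [0, 1]}


def not_equal_table(dom_x, dom_y):
    """Extensional form of a binary 'x != y' constraint: the set of allowed pairs."""
    return {(vx, vy) for vx in dom_x for vy in dom_y if vx != vy}


# Each table constraint is (scope, allowed_tuples).
constraints = [
    (("a", "b"), not_equal_table(domains["a"], domains["b"])),
    (("b", "c"), {(0, 0), (1, 1), (2, 1)}),  # assumed table, for illustration
]


def is_consistent(assignment, constraints):
    """Check every constraint whose scope is fully assigned."""
    for scope, allowed in constraints:
        if all(var in assignment for var in scope):
            if tuple(assignment[var] for var in scope) not in allowed:
                return False
    return True


def solutions(domains, constraints):
    """Enumerate all complete assignments that satisfy every table constraint."""
    names = list(domains)
    for values in product(*(domains[name] for name in names)):
        assignment = dict(zip(names, values))
        if is_consistent(assignment, constraints):
            yield assignment


if __name__ == "__main__":
    for solution in solutions(domains, constraints):
        print(solution)
```

Storing the allowed tuples explicitly is what makes the constraint extensional; a practical solver would propagate over these tables rather than enumerate every complete assignment as this sketch does.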
Finding Solutions to CSPs

It is possible to categorise CSP solution methods as being either constructive or local in nature. Local search methods (Clark et al. 1996) generally employ a hill-climbing strategy and are guided by a cost function. Moving over the space of completely instantiated variables by changing, say, one variable at a time (local change), the search stops either when the problem is solved (cost equals zero) or when no further improvement in cost is available (local minimum). This paper describes a system that makes use of a constructive approach (Tsang 1993) and, for that reason, the following discussion focuses on this area.

Constructive search attempts to iteratively extend a consistent partial assignment of the variables in the CSP. This continues until a consistent assignment of all variables is made. If the partial assignment proves inconsistent, the algorithm fails and backtracking takes place. The important parts of this procedure are the order in which the variables are selected, the order in which the values are tried on those variables, the method by which variable assignments are propagated, and the means by which the search procedure backtracks.

Looking first at the variable selection strategy, the ordering can be static or dynamic. The former specifies the order before search begins while the latter makes a decision based on the current state of the search. An example of a computationally inexpensive variable ordering heuristic is the Minimum Remaining Values (MRV) (Gent et al. 1996), or first fail, method. In this approach, the variable with the lowest number of remaining values is chosen. This results in the branching factor being minimised for the longest possible time. MRV has been shown to work well on a large number of CSPs (Dechter and Meiri 1994).

Having chosen a variable to instantiate, it is now necessary to assign a value to that variable. A good choice of value will reduce the amount of backtracking required.
It should be noted that, if the correct choice of value is made at each point of the search, it is possible to reach a solution with no backtracks. In contrast to variable ordering, the strategy for value selection is to choose the value that is most likely to succeed, since failure would cause the search to backtrack. An example of a value ordering heuristic is the min-conflict heuristic (Dechter 2003). In this method, the values are ordered based on the number of conflicts that they are involved in with the unassigned variables.

With a variable selected and a value assigned, inference can be used to construct new constraints based on this latest assignment. There are many propagation methods available, including forward checking and k-consistency strategies (Freuder 1978). One commonly used technique is arc consistency (Mackworth 1977). Here, the algorithm guarantees that any allowable value in a variable’s domain is consistent with a permitted value in the domain of any other single variable.

If the propagation step finds an inconsistency, it is necessary to backtrack. A simple backtracking technique steps back over the last made assignment and tries an alternative value from that variable’s domain. More advanced methods can pinpoint the variable responsible for causing the failure and will backtrack accordingly. These include backmarking, backjumping and conflict-directed backjumping (CBJ) (Prosser 1993).

CSP approaches to planning

CSP techniques have been used in planning for many years, although early systems made use of constraint methods only to solve part of the problem (Stefik 1981), (Joslin and Pollack 1995) and (Goldman et al. 2000). That is, subproblems were posted and solved as CSPs, with the result returned for use by traditional planning machinery. Systems that completely encode the planning problem as a CSP include CPlan (van Beek and Chen 1999), GP-CSP (Do and Kambhampati 2001) and Csp-Plan (Lopez and Bacchus 2003).
The first of these makes use of various types of constraint, including distance constraints and capacity constraints, although these are manually specified by the user. GP-CSP encodes the planning graph as a dynamic CSP, whereas Csp-Plan avoids the planning graph and instead exploits the CSP encoding by adding new constraints and removing single valued variables. Another system (Gregory, Long, and Fox 2007), based on a meta-CSP formulation of the planning problem, performed well when compared to state-of-the-art SAT (Kautz and Selman 1992) planners. Reformulation of a number of earlier CSP-based encodings using table constraints has been shown (Bartak and Toropila 2008b) to be effective. Also, the inclusion of symmetry breaking, singleton consistency and lifting (Bartak and Toropila 2008a) improves CSP planner performance. Recent CSP planning systems include those based on timelines (Verfaillie and Pralet 2008), (Cesta and Fratini 2008), (Bartak 2011) and one that makes use of dominance constraints (Gregory, Fox, and Long 2010).

Technical Details

Whilst recasting a planning problem as a CSP does allow CP-specific tools to be used, generally the solution process does not make use of the structure of the original planning problem. That is, the planning problem is converted into a CSP and the CSP is solved using CP methods. In order to make better use of the information implicit in the original problem, it is helpful to note some sources of leverage in planning. For example, a time bound will force out useless actions from the plan, thereby aiding progress towards a solution. Similarly, the consumption of a resource may prevent goal achievement, making it necessary to undo a previous assignment. With this in mind, is it possible to employ similar forms of leverage in the CSP paradigm? The following sections describe two techniques that attempt to do so.
The first, a goal-centric variable and value heuristic, introduces a series of time bounds, allowing for more propagation and the removal of unhelpful actions. The second, a meta-CSP technique, locks goal values once these are achieved.

Preliminaries

The test configuration used in this work makes use of a heuristic estimate to gain an initial seed plan length. Empirical testing has shown that this generous CSP seed plan length works on all of the test domains. However, further work in this area may improve the accuracy of the estimation and consequently could lead to improved efficiency.

Due to the goal-centric nature of the following procedure, it is first necessary to order the goals. Such a goal ordering is achieved by breaking all cycles in the original planning problem’s causal graph (Helmert 2004) and sorting topologically.

With a seed plan length known, the CSP can be constructed. Referring to Figure 1, the partial solution matrix shows some of the CSP (state) variables (columns 1 to 3) and their associated domains for a small logistics problem example. The final column shows the action variables, also with their domains. At this stage, only the variables and their domains are represented. The next step is to define and apply the constraints. The constraints represent the grounded actions derived from the planning problem. That is, the means by which the variables change value. Hence, the action in action slot one operates on the variables in the initial state to produce row two in the matrix, and so on.

[Figure 1: partial solution matrix for the small logistics example. The first row holds the initial-state values 3, 5, 2, ... of the state variables shown. Each later row (action slots Ac1, Ac2, ..., Ac16, ...) holds the unassigned state variables with domains {0..8}, {0..8}, {0..5}, ... and an action variable with domain {1..109}.]
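The goal-ordering step described in the Preliminaries can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how cycles in the causal graph are broken, so the sketch simply drops the back edges found by a depth-first search, and it assumes that goals are then taken in topological order (the direction is not specified in the text). The graph, the variable names and the function names `break_cycles`, `topological_order` and `order_goals` are hypothetical.

```python
from collections import deque


def break_cycles(graph):
    """Return an acyclic copy of graph by dropping DFS back edges (an assumed policy)."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}
    acyclic = {node: [] for node in graph}

    def visit(node):
        colour[node] = GREY
        for succ in graph[node]:
            if colour[succ] == GREY:
                continue  # back edge: it would close a cycle, so leave it out
            acyclic[node].append(succ)
            if colour[succ] == WHITE:
                visit(succ)
        colour[node] = BLACK

    for node in graph:
        if colour[node] == WHITE:
            visit(node)
    return acyclic


def topological_order(graph):
    """Kahn's algorithm on an acyclic graph given as {node: [successors]}."""
    indegree = {node: 0 for node in graph}
    for node in graph:
        for succ in graph[node]:
            indegree[succ] += 1
    queue = deque(node for node in graph if indegree[node] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for succ in graph[node]:
            indegree[succ] -= 1
            if indegree[succ] == 0:
                queue.append(succ)
    return order


def order_goals(causal_graph, goal_variables):
    """Sort goal variables by their position in the topological order."""
    order = topological_order(break_cycles(causal_graph))
    rank = {var: position for position, var in enumerate(order)}
    return sorted(goal_variables, key=lambda var: rank[var])


if __name__ == "__main__":
    # Hypothetical causal graph; every variable must appear as a key.
    causal_graph = {
        "truck": ["package"],
        "package": ["truck"],  # creates a cycle with "truck"
        "plane": ["package"],
    }
    print(order_goals(causal_graph, ["package", "truck"]))
```

Which edge of a cycle is removed depends on the traversal order, so a different cycle-breaking policy could yield a different, equally valid goal ordering.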
Publication date: 2015